FINAL REPORT

MULTISTRATEGY  LEARNING  FOR  COMPUTER  VISION 
GRANT  NUMBER  F49620-95-1-0424 
PI: Bir Bhanu
UC  Riverside 


1.  SUMMARY 

This final report describes the work performed under the
DARPA/AFOSR grant F49620-95-1-0424 on “Multistrategy Learning for
Computer Vision” during the period from July 1, 1995 to June 30, 1998. Below
we summarize the objectives of the program and the accomplishments achieved
during its course. Selected papers published during the course of
the program are attached to this report.

2.  OBJECTIVES 

Current image understanding (IU) algorithms and systems lack the robustness to
successfully process imagery acquired under most real-world scenarios. They do
not provide the necessary consistency, reliability, and predictability of
results. Robust 3-D object recognition remains one of the important but elusive
goals of IU research for practical applications. To achieve this robustness, our
research at the University of California at Riverside (UCR) is directed towards
learning the parameters, feedback, contexts, features, concepts, and strategies
of IU algorithms for model-based object recognition.

Our multistrategy learning-based IU approach selectively applies machine
learning techniques in innovative ways at multiple levels to achieve robust
recognition performance. At each level, appropriate evaluation criteria are
employed to monitor the performance and self-improvement of the system.

The  results  of  our  research  are  being  applied  in  automatic  target  recognition, 
autonomous  navigation,  and  image  and  video  databases. 

3.  MAJOR  ACCOMPLISHMENTS/NEW  FINDINGS 


A.  CLOSED-LOOP  IMAGE  UNDERSTANDING  SYSTEMS 
(Documents  #2,  3  &  7) 

Robustness of an IU system can be enhanced using feedback. However,
how to control feedback in a multi-level IU system has been a long-standing
problem in the field of computer vision and pattern recognition. We have
developed reinforcement learning-based techniques that show promise in
approaching this problem (please see attached documents #2 and #3).

Our theoretically sound approaches to controlling feedback use the results of
recognition to learn segmentation and feature extraction parameters for
robust model-based recognition. They are based on a team of
learning automata algorithm and a delayed reinforcement learning
algorithm.

The closed-loop object recognition system evaluates the performance of
segmentation and feature extraction by using the recognition algorithm as
part of the evaluation function. Recognition confidence is used as a
reinforcement signal to the image segmentation or feature extraction
processes. As a result, the system is able to develop recognition strategies
automatically and to recognize objects accurately in newly acquired images.
In contrast, the genetic algorithm-based techniques that we developed
earlier simply search for a set of parameters that optimizes a prespecified
evaluation function.
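
The closed-loop idea described above can be illustrated with a minimal Python
sketch. It assumes a linear reward-inaction update for each automaton, which is
one standard learning-automata scheme and not necessarily the one used in
documents #2 and #3. The function toy_confidence is a hypothetical stand-in for
the segmentation, feature extraction, and recognition chain; the parameter
ranges, learning rate, and all names are illustrative assumptions.

```python
import random


class LearningAutomaton:
    """One automaton selects a discrete value for one algorithm parameter."""

    def __init__(self, values, learning_rate=0.1):
        self.values = list(values)
        self.learning_rate = learning_rate
        # Start from a uniform action-probability vector.
        self.probs = [1.0 / len(self.values)] * len(self.values)
        self.last = 0

    def choose(self, rng):
        """Sample a parameter value according to the current probabilities."""
        self.last = rng.choices(range(len(self.values)), weights=self.probs)[0]
        return self.values[self.last]

    def reinforce(self, r):
        """Linear reward-inaction update (assumed scheme), r in [0, 1]."""
        step = self.learning_rate * r
        for i in range(len(self.probs)):
            if i == self.last:
                self.probs[i] += step * (1.0 - self.probs[i])
            else:
                self.probs[i] -= step * self.probs[i]


def toy_confidence(params):
    """Hypothetical stand-in for segment -> extract features -> recognize."""
    target = (3, 1)  # assumed best parameter setting for this toy problem
    distance = sum(abs(p - t) for p, t in zip(params, target))
    return max(0.0, 1.0 - 0.25 * distance)


def train(num_trials=2000, seed=0):
    rng = random.Random(seed)
    team = [LearningAutomaton(range(5)), LearningAutomaton(range(5))]
    for _ in range(num_trials):
        params = tuple(a.choose(rng) for a in team)
        r = toy_confidence(params)  # recognition confidence as reinforcement
        for a in team:
            a.reinforce(r)          # every automaton receives the same signal
    return team
```

Calling train() returns the two automata; their probs vectors show which
parameter values the team has come to prefer. Because every automaton receives
the same recognition confidence, no separate segmentation-specific evaluation
function is needed in this sketch.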

Using  the  Phoenix  algorithm  for  the  segmentation  of  color  images,  a 
clustering-based  algorithm  for  the  recognition  of  occluded  2-D  objects  and  a 
team  of  learning  automata  algorithm,  or  a  delayed  reinforcement  learning 
algorithm,  we  show  that  in  simple  real  scenes  with  varying  environmental 
conditions  and  camera  motion,  effective  low-level  image  analysis  and 
feature extraction can be performed. We show the performance
improvement of an IU system combined with learning over an IU system
with no learning.

The  results  of  this  research  are  being  used  for  model-based  recognition  of 
targets  in  SAR  images  acquired  under  extended  operating  conditions  (please 
see  “Adaptive  Target  Recognition  Using  Reinforcement  Learning,”  by 
Bhanu,  Lin,  Jones,  and  Peng  (DARPA  IUW98)).  They  have  also  been 
applied  to  the  problem  of  autonomous  navigation  (please  see  attached 
document  #7) . 

B.  LEARNING  BASED  INTEGRATED  RECOGNITION  AND  SEGMENTATION 
(Document  #4) 

We have developed a general approach to image segmentation and object
recognition that can adapt to changing environmental conditions. It
allows the automated acquisition of recognition strategies in dynamic
environments. The learning paradigm used here is reinforcement
learning, the same as in A above. Incorporating domain knowledge into a
reinforcement learning paradigm and implementing it efficiently are
important challenges posed by computer vision applications. We have used
edge-border coincidence for both local and global segmentation
evaluation. However, since this measure is not reliable for object
recognition, it is used in conjunction with model matching in a closed-loop
object recognition system. Segmentation parameters are learned using a
reinforcement learning algorithm that is based on a team of learning
automata and uses edge-border coincidence or the results of model
matching as reinforcement signals. The performance improvements are
shown in the attached document #4.
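
As an illustration of the segmentation measure mentioned above, the following
Python sketch computes edge-border coincidence under one assumed formulation:
the fraction of edge-detector pixels that fall on region borders of the
segmentation. The exact definition used in document #4 may differ, and the
function names and the border convention (comparison with the right and lower
neighbour) are assumptions made for this example.

```python
import numpy as np


def region_borders(labels):
    """Mark pixels whose right or lower neighbour has a different label."""
    borders = np.zeros(labels.shape, dtype=bool)
    borders[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    borders[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return borders


def edge_border_coincidence(edges, labels):
    """Fraction of edge pixels lying on region borders (assumed definition)."""
    edges = np.asarray(edges, dtype=bool)
    n_edges = edges.sum()
    if n_edges == 0:
        return 0.0  # no edge evidence, so no coincidence can be measured
    overlap = np.logical_and(edges, region_borders(np.asarray(labels))).sum()
    return float(overlap) / float(n_edges)
```

A value near 1 means the region borders agree with the edge map. In a
closed-loop setting, either this value or a model-matching score could be fed
back as the reinforcement signal.
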

C. SCALABILITY OF GENETIC LEARNING FOR ADAPTIVE SEGMENTATION
(Document #8)

The problem is to learn algorithm parameters and to develop algorithms and
evaluation criteria for multisensor image segmentation and recognition
from images acquired under varying environmental conditions. We have
developed techniques based on genetic learning and on hybrid methods,
such as a combination of genetic algorithms and hill climbing.

Our initial research using outdoor video imagery and the Phoenix
algorithm has demonstrated that (a) adaptive image segmentation can
provide over 30% improvement in performance, as measured by the quality
of segmentation, over non-adaptive techniques, and (b) learning from
experience can be used to improve the performance over time. In our
current work, we show that our approach scales with respect to the number
of parameters and the size of the search space. Genetic learning combined
with a hill-climbing technique is able to adaptively select good
segmentation parameters and to generate the best result using the fewest
segmentations. In experiments designed to evaluate the
scalability of our approach, we find that for a set of four Phoenix
parameters we search only about 0.5% of the search space, which contains
about 1 million parameter combinations.
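
To put the last figure in perspective, 0.5% of about 1 million combinations is
roughly 5,000 segmentations. The Python sketch below illustrates a hybrid of
genetic search and hill climbing on a toy problem. It is not the method of
document #8: the 32 levels per parameter (giving 32^4 = 1,048,576 combinations),
the unimodal toy_quality function, and the population settings are all
assumptions chosen for illustration. Real segmentation-quality surfaces are
generally not unimodal.

```python
import random

LEVELS = 32      # assumed number of discrete values per parameter
NUM_PARAMS = 4   # four parameters: 32**4 = 1,048,576 combinations


def toy_quality(p):
    """Hypothetical segmentation-quality score in [0, 1], best at `peak`."""
    peak = (20, 7, 12, 25)
    spread = NUM_PARAMS * (LEVELS - 1)
    return 1.0 - sum(abs(a - b) for a, b in zip(p, peak)) / spread


class CountingEvaluator:
    """Caches scores so each distinct parameter set is evaluated only once."""

    def __init__(self, fn):
        self.fn = fn
        self.cache = {}

    def __call__(self, p):
        if p not in self.cache:
            self.cache[p] = self.fn(p)
        return self.cache[p]


def hill_climb(p, evaluate):
    """Move to the best +/-1 neighbour until no neighbour improves."""
    while True:
        neighbours = [p[:i] + (p[i] + d,) + p[i + 1:]
                      for i in range(NUM_PARAMS) for d in (-1, 1)
                      if 0 <= p[i] + d < LEVELS]
        best = max(neighbours, key=evaluate)
        if evaluate(best) <= evaluate(p):
            return p
        p = best


def genetic_search(evaluate, rng, pop_size=10, generations=5):
    """Keep the better half, refill by one-point crossover and mutation."""
    pop = [tuple(rng.randrange(LEVELS) for _ in range(NUM_PARAMS))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, NUM_PARAMS)
            child = list(a[:cut] + b[cut:])
            if rng.random() < 0.2:  # occasional random mutation
                child[rng.randrange(NUM_PARAMS)] = rng.randrange(LEVELS)
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=evaluate)


def hybrid_search(seed=0):
    """Genetic search for a good start, then hill climbing to refine it."""
    rng = random.Random(seed)
    evaluate = CountingEvaluator(toy_quality)
    best = hill_climb(genetic_search(evaluate, rng), evaluate)
    return best, evaluate(best), len(evaluate.cache)
```

hybrid_search() returns the best parameter set found, its score, and the number
of distinct parameter sets evaluated, so the fraction of the search space
visited can be compared directly with LEVELS ** NUM_PARAMS.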

D.  LEARNING  TO  INTEGRATE  CONTEXT  WITH  CLUTTER  MODELS 
(Document  #10) 

The  problem  is  to  integrate  contextual  information  with  clutter  models  for 
target  detection  and  recognition.  Current  image  metrics  commonly  used  to 
characterize  images  do  not  correlate  well  with  the  performance  of  target 
recognition  systems. 

The contextual parameters, which describe the environmental conditions
for each training example, are used in a reinforcement learning paradigm
to improve the clutter models and enhance target detection performance
under multi-scenario situations. New Gabor transform-based features and
other statistical image features are used to capture the statistical properties of
natural backgrounds in visible and FLIR images. The non-incremental
self-organizing map approach commonly used in an unsupervised mode is
extended, by the addition of a near-miss injection algorithm, and used as an
incremental supervised learning process for clutter characterization.

A fast algorithm to compute the Gabor transform of a given image has been
implemented. We have implemented two new Gabor transform-based
feature groups and tested their classification performance on natural
backgrounds. Experimental results show that the two feature groups
capture certain characteristics of the backgrounds, which are consistent with
our theoretical expectations based on the physical meaning of each attribute
within the feature group. Using second-generation FLIR images, four
contextual parameters (time of day, depression angle, range to the target,
and air temperature), and five feature groups, we obtain a 100% detection
rate, a 10% false alarm rate, and a significant improvement in the confidence
for classifying a feature cell (a rectangular region in an image) as clutter
or target.
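
For readers unfamiliar with Gabor features, the Python sketch below shows one
generic way to obtain an orientation-selective texture measure. It does not
reproduce the fast Gabor transform or the two feature groups of document #10.
The real-valued kernel, the FFT-based circular convolution, and the energy
measure are assumptions made for illustration, as are all function and
parameter names.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Real Gabor kernel: Gaussian envelope times an oriented cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the carrier
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()  # zero mean: flat regions give no response


def gabor_energy(image, kernel):
    """Mean squared filter response, using circular convolution via the FFT."""
    image = np.asarray(image, dtype=float)
    padded = np.zeros_like(image)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel  # the shift of the kernel does not affect energy
    response = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)).real
    return float(np.mean(response ** 2))
```

Computing gabor_energy for several orientations and wavelengths over a feature
cell yields a small feature vector; a texture with stripes of a given spacing
and direction gives its largest energy for the matching kernel.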

E.  INPUT  ADAPTATION  USING  MODIFIED  HEBBIAN  LEARNING 
(Document  #9) 

The problem is to improve the performance of an IU algorithm by adapting
its input data to the desired form so that it is optimal for the given algorithm.

The two general methodologies for improving the performance of an IU
system are based on optimization of algorithm parameters and adaptation of
the input. Unlike the genetic learning case for adaptive image
segmentation, here we focus on the second methodology and use modified
Hebbian learning rules to build adaptive feature extractors which transform
the input data into the desired form for a given algorithm. The learning rules
are based on different loss functions and are suitable for extracting
expressive or discriminating features from the input.

The feasibility of the approach is shown by designing an input adaptor for a
thresholding algorithm for target detection using SAR and FLIR images.
The results with the input adaptor are excellent compared to the case with
no input adaptor.
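
As a generic illustration of a modified Hebbian rule, the Python sketch below
uses Oja's rule, a well-known variant that adds a decay term to plain Hebbian
learning so that the weight vector converges toward the first principal
direction of the input. It is offered only as an example of an "expressive"
feature extractor; the specific rules and loss functions of document #9 are not
reproduced here, and the function name, learning rate, and epoch count are
assumptions.

```python
import numpy as np


def oja_first_component(data, learning_rate=0.01, epochs=50, seed=0):
    """Estimate the leading principal direction with Oja's rule."""
    rng = np.random.default_rng(seed)
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)              # the rule assumes zero-mean input
    w = rng.normal(size=x.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for sample in x[rng.permutation(len(x))]:
            y = w @ sample              # output of the linear unit
            # Hebbian term y * sample, with decay y**2 * w that bounds |w|.
            w += learning_rate * y * (sample - y * w)
    return w / np.linalg.norm(w)
```

Projecting each input onto the returned vector gives a one-dimensional adapted
input that could then be passed to a downstream algorithm such as a
thresholding step.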

F.  SYSTEM  FOR  AIRCRAFT  RECOGNITION 
(Documents  #6  &  5) 

We developed an IU system for aircraft recognition. The complete report on
the system, including its capabilities and limitations, is given in document
#6. Document #5 describes a multistrategy learning-based system for
aircraft recognition. We also investigated the development of a case-based
reasoning approach for learning strategies for model-based recognition.

G. COMPREHENSIVE PAPER
(Document  #1) 

We wrote and refined a comprehensive paper on applying learning techniques
to computer vision problems.